{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "时间序列（time series）数据是一种重要的结构化数据形式，主要由以下几种：\n",
    "● 时间戳（timestamp），特定的时刻\n",
    "● 固定时期（period），如2007年1月或2010年全年\n",
    "● 时间间隔（interval），由起始和结束时间戳表示。时期（period）可以被看做间隔（interval）的特例"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "# 时间和日期数据类型及其工具\n",
    "Python标准库包含用于日期（date）和时间（time）数据的数据类型，而且还有日历方面的功能。我们主要会用到datetime、time以及calendar模块。datetime.datetime（也可以简写为datetime）是用得最多的数据类型。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2024, 11, 29, 11, 12, 12, 296287)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from datetime import datetime\n",
    "now = datetime.now()\n",
    "now"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2024, 11, 29)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "now.year, now.month, now.day"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.timedelta(days=926, seconds=56700)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# datetime以毫秒形式存储日期和时间。timedelta表示两个datetime对象之间的时间差\n",
    "d = datetime(2011,1,7)-datetime(2008,6,24,8,15)\n",
    "d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(926, 56700)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d.days,d.seconds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2011, 1, 31, 0, 0)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#可以给datetime对象加上（或减去）一个或多个timedelta，这样会产生一个新对象\n",
    "from  datetime import timedelta\n",
    "start = datetime(2011,1,7)\n",
    "start+timedelta(12)*2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "# 字符串和datetime的相互转换"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "|代码|\t说明|\n",
    "|---|---|\n",
    "|%Y\t|4位数的年|\n",
    "|%y\t|2位数的年|\n",
    "|%m\t|2位数的月[01,12]|\n",
    "|%d\t|2位数的日[01,31]|\n",
    "|%H\t|时（24小时制）[00,23]|\n",
    "|%I\t|时（12小时制）[01,12]|\n",
    "|%M\t|2位数的分[00,59]|\n",
    "|%S\t|秒[00,61](秒60和61用于闰秒)|\n",
    "|%w\t|用整数表示星期几[0(星期天),6]|\n",
    "|%U\t|每年的第几周[00，53]。星期天被认为是每周的第一天，每年第一个星期天之前的那几天被认为是“第0周”|\n",
    "|%W\t|每年的第几周[00，53]。星期一被认为是每周的第一天，每年第一个星期一之前的那几天被认为是“第0周”|\n",
    "|%z\t|以+HHMM或-HHMM表示的UTC时区偏移量，如果时区为naive，则返回空字符串|\n",
    "|%F\t|%Y-%m-%d简写形式，例如2012-4-18|\n",
    "|%D\t|%m/%d/%y简写形式，例如04/18/12|"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2011-01-03 00:00:00'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 利用str或strftime方法（传入一个格式化字符串），datetime对象和pandas的Timestamp对象（稍后就会介绍）可以被格式化为字符串\n",
    "stamp = datetime(2011,1,3)\n",
    "str(stamp) # 将datetime转为str"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2011-01-03'"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stamp.strftime('%Y-%m-%d') # 将datetime转为格式化的str"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2011, 1, 2, 0, 0)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# datetime.strptime可以用这些格式化编码将字符串转换为日期\n",
    "v  = '2011-01-02'\n",
    "datetime.strptime(v,'%Y-%m-%d') #格式化的str转为datetime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[datetime.datetime(2011, 7, 6, 0, 0), datetime.datetime(2011, 2, 6, 0, 0)]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d = ['7/6/2011','2/6/2011']\n",
    "[datetime.strptime(x,'%m/%d/%Y') for x in d]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2011, 1, 3, 0, 0)"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# datetime.strptime是通过已知格式进行日期解析的最佳方式。但是每次都要编写格式定义是很麻烦的事情，尤其是对于一些常见的日期格式。\n",
    "# 这种情况下，可以用dateutil这个第三方包中的parser.parse方法（pandas中已经自动安装好了）\n",
    "from dateutil.parser import parse\n",
    "parse('2011-01-03')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(2011, 12, 6, 0, 0)"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "parse('6/12/2011',dayfirst=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "datetime.datetime(1997, 1, 31, 22, 45)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "parse('Jan 31, 1997 10:45 PM')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2011-07-06 12:00:00', '2012-08-09 00:00:00'], dtype='datetime64[ns]', freq=None)"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# pandas通常是用于处理成组日期的，不管这些日期是DataFrame的轴索引还是列。\n",
    "# to_datetime方法可以解析多种不同的日期表示形式。对标准日期格式（如ISO8601）的解析非常快\n",
    "import pandas as pd\n",
    "d = ['2011-07-06 12:00:00','2012-08-09 00:00:00']\n",
    "pd.to_datetime(d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2011-07-06 12:00:00', '2012-08-09 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "idx = pd.to_datetime(d+[None]) # 可以拼接另外的时间字符串\n",
    "idx"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([False, False,  True])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 其中NaT（Not a Time）是pandas中时间戳数据的null值\n",
    "pd.isnull(idx) # 处理空时间值"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "# 时间序列基础\n",
    "pandas最基本的时间序列类型就是以时间戳（通常以Python字符串或datatime对象表示）为索引的Series"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    1.391598\n",
       "2011-01-05   -0.649087\n",
       "2011-01-07   -1.395488\n",
       "2011-01-08    1.216907\n",
       "2011-01-10    1.093867\n",
       "2011-01-12    1.014923\n",
       "dtype: float64"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from datetime import datetime\n",
    "dates = [datetime(2011, 1, 2), datetime(2011, 1, 5),\n",
    "datetime(2011, 1, 7), datetime(2011, 1, 8),\n",
    "datetime(2011, 1, 10), datetime(2011, 1, 12)]\n",
    "ts = pd.Series(np.random.randn(6), index=dates)\n",
    "ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',\n",
       "               '2011-01-10', '2011-01-12'],\n",
       "              dtype='datetime64[ns]', freq=None)"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.index\n",
    "# pandas用numpy的datetime64数据类型以纳秒形式存储时间戳"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    2.783195\n",
       "2011-01-05         NaN\n",
       "2011-01-07   -2.790976\n",
       "2011-01-08         NaN\n",
       "2011-01-10    2.187735\n",
       "2011-01-12         NaN\n",
       "dtype: float64"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 跟其他Series一样，不同索引的时间序列之间的算术运算会自动按日期对齐\n",
    "ts+ts[::2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('<M8[ns]')"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.index.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Timestamp('2011-01-02 00:00:00')"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s = ts.index[0]\n",
    "s"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.float64(-1.3954879840678946)"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 当你根据标签索引选取数据时，时间序列和其它的pandas.Series很像：\n",
    "stamp = ts.index[2]\n",
    "ts[stamp]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.float64(-1.3954879840678946)"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts['1/07/2011']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.float64(-1.3954879840678946)"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts['20110107']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01   -0.724278\n",
       "2000-01-02    1.211992\n",
       "2000-01-03   -2.676162\n",
       "2000-01-04    1.309201\n",
       "2000-01-05    0.235807\n",
       "                ...   \n",
       "2002-09-22   -0.917234\n",
       "2002-09-23   -1.458120\n",
       "2002-09-24    0.008542\n",
       "2002-09-25    0.502642\n",
       "2002-09-26    1.396480\n",
       "Freq: D, Length: 1000, dtype: float64"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts1 = pd.Series(np.random.randn(1000),index=pd.date_range('1/1/2000',periods=1000))\n",
    "ts1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2001-01-01    0.740514\n",
       "2001-01-02   -1.082470\n",
       "2001-01-03   -0.275387\n",
       "2001-01-04    0.513526\n",
       "2001-01-05   -0.764417\n",
       "                ...   \n",
       "2001-12-27   -0.587038\n",
       "2001-12-28   -2.028827\n",
       "2001-12-29   -1.264655\n",
       "2001-12-30   -0.463114\n",
       "2001-12-31   -0.427229\n",
       "Freq: D, Length: 365, dtype: float64"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts1['2001']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2001-05-01    0.418190\n",
       "2001-05-02   -1.793682\n",
       "2001-05-03   -0.237125\n",
       "2001-05-04   -0.021221\n",
       "2001-05-05    0.606820\n",
       "2001-05-06    1.338278\n",
       "2001-05-07   -1.050222\n",
       "2001-05-08    0.196252\n",
       "2001-05-09   -0.257888\n",
       "2001-05-10    0.034929\n",
       "2001-05-11    0.660898\n",
       "2001-05-12   -1.541735\n",
       "2001-05-13   -0.578517\n",
       "2001-05-14    0.490608\n",
       "2001-05-15    0.521398\n",
       "2001-05-16   -1.857809\n",
       "2001-05-17   -0.572794\n",
       "2001-05-18   -1.540529\n",
       "2001-05-19   -0.380687\n",
       "2001-05-20    1.687881\n",
       "2001-05-21   -1.176512\n",
       "2001-05-22    0.025636\n",
       "2001-05-23    0.023013\n",
       "2001-05-24   -0.140943\n",
       "2001-05-25   -0.604202\n",
       "2001-05-26   -0.851038\n",
       "2001-05-27   -0.580242\n",
       "2001-05-28    0.674737\n",
       "2001-05-29   -0.370740\n",
       "2001-05-30   -1.007100\n",
       "2001-05-31    1.331879\n",
       "Freq: D, dtype: float64"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts1['2001-05']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    1.391598\n",
       "2011-01-05   -0.649087\n",
       "2011-01-07   -1.395488\n",
       "2011-01-08    1.216907\n",
       "2011-01-10    1.093867\n",
       "2011-01-12    1.014923\n",
       "dtype: float64"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-07   -1.395488\n",
       "2011-01-08    1.216907\n",
       "2011-01-10    1.093867\n",
       "2011-01-12    1.014923\n",
       "dtype: float64"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts[datetime(2011,1,7):]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    1.391598\n",
       "2011-01-05   -0.649087\n",
       "2011-01-07    1.000000\n",
       "2011-01-08    1.000000\n",
       "2011-01-10    1.000000\n",
       "2011-01-12    1.014923\n",
       "dtype: float64"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts['1/6/2011':'1/11/2011'] = 1\n",
    "ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    1.391598\n",
       "2011-01-05   -0.649087\n",
       "2011-01-07    1.000000\n",
       "2011-01-08    1.000000\n",
       "dtype: float64"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.truncate(after='1/9/2011')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-10    1.000000\n",
       "2011-01-12    1.014923\n",
       "dtype: float64"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.truncate(before='1/9/2011')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01    0\n",
       "2000-01-02    1\n",
       "2000-01-02    2\n",
       "2000-01-02    3\n",
       "2000-01-03    4\n",
       "dtype: int64"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 带有重复索引的时间序列\n",
    "dates = pd.DatetimeIndex(['1/1/2000', '1/2/2000', '1/2/2000','1/2/2000', '1/3/2000'])\n",
    "ts2 = pd.Series(np.arange(5), index=dates)\n",
    "ts2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts2.index.is_unique"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.int64(4)"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts2['1/3/2000']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-02    1\n",
       "2000-01-02    2\n",
       "2000-01-02    3\n",
       "dtype: int64"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts2['1/2/2000']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01    0.0\n",
       "2000-01-02    2.0\n",
       "2000-01-03    4.0\n",
       "dtype: float64"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts2.groupby(level=0).mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01    1\n",
       "2000-01-02    3\n",
       "2000-01-03    1\n",
       "dtype: int64"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts2.groupby(level=0).count()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "# 日期的范围、频率以及移动\n",
    "pandas中的原生时间序列一般被认为是不规则的，也就是说，它们没有固定的频率。对于大部分应用程序而言，这是无所谓的。但是，它常常需要以某种相对固定的频率进行分析，比如每日、每月、每15分钟等（这样自然会在时间序列中引入缺失值）。幸运的是，pandas有一整套标准时间序列频率以及用于重采样、频率推断、生成固定频率日期范围的工具。例如，我们可以将之前那个时间序列转换为一个具有固定频率（每日）的时间序列，只需调用resample即可。\n",
    "\n",
    "**生成日期范围**\n",
    "默认情况下，date_range会产生按天计算的时间点。如果只传入起始或结束日期，那就还得传入一个表示一段时间的数字。\n",
    "起始和结束日期定义了日期索引的严格边界。例如，如果你想要生成一个由每月最后一个工作日组成的日期索引，可以传入\"BM\"频率（表示business end of month），这样就只会包含时间间隔内（或刚好在边界上的）符合频率要求的日期。\n",
    "<img alt=\"Image Description\" height=\"300\" src=\"https://cdn.nlark.com/yuque/0/2021/png/1862552/1638194806833-2d567b3e-0bba-442d-be34-344a6d29fad2.png\" width=\"500\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',\n",
       "               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',\n",
       "               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',\n",
       "               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',\n",
       "               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',\n",
       "               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',\n",
       "               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',\n",
       "               '2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',\n",
       "               '2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',\n",
       "               '2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',\n",
       "               '2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',\n",
       "               '2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',\n",
       "               '2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',\n",
       "               '2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',\n",
       "               '2012-05-27', '2012-05-28', '2012-05-29', '2012-05-30',\n",
       "               '2012-05-31', '2012-06-01'],\n",
       "              dtype='datetime64[ns]', freq='D')"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "index = pd.date_range('2012-04-01','2012-06-01')\n",
    "index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',\n",
       "               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',\n",
       "               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',\n",
       "               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',\n",
       "               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20'],\n",
       "              dtype='datetime64[ns]', freq='D')"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.date_range('2012-04-01',periods=20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2012-03-13', '2012-03-14', '2012-03-15', '2012-03-16',\n",
       "               '2012-03-17', '2012-03-18', '2012-03-19', '2012-03-20',\n",
       "               '2012-03-21', '2012-03-22', '2012-03-23', '2012-03-24',\n",
       "               '2012-03-25', '2012-03-26', '2012-03-27', '2012-03-28',\n",
       "               '2012-03-29', '2012-03-30', '2012-03-31', '2012-04-01'],\n",
       "              dtype='datetime64[ns]', freq='D')"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.date_range(end='2012-04-01',periods=20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\98614\\AppData\\Local\\Temp\\ipykernel_1496\\1518776410.py:1: FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.\n",
      "  pd.date_range('2012-01-01','2012-12-31',freq='M')\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2012-01-31', '2012-02-29', '2012-03-31', '2012-04-30',\n",
       "               '2012-05-31', '2012-06-30', '2012-07-31', '2012-08-31',\n",
       "               '2012-09-30', '2012-10-31', '2012-11-30', '2012-12-31'],\n",
       "              dtype='datetime64[ns]', freq='ME')"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.date_range('2012-01-01','2012-12-31',freq='M')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "**频率和日期偏移量**\n",
    "pandas中的频率是由一个基础频率（base frequency）和一个乘数组成的。基础频率通常以一个字符串别名表示，比如\"M\"表示每月，\"H\"表示每小时。对于每个基础频率，都有一个被称为日期偏移量（date offset）的对象与之对应。例如，按小时计算的频率可以用Hour类表示"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<Hour>"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from pandas.tseries.offsets import Hour, Minute\n",
    "hour = Hour()\n",
    "hour"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<4 * Hours>"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h = Hour(4)\n",
    "h"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 04:00:00',\n",
       "               '2000-01-01 08:00:00', '2000-01-01 12:00:00',\n",
       "               '2000-01-01 16:00:00', '2000-01-01 20:00:00',\n",
       "               '2000-01-02 00:00:00'],\n",
       "              dtype='datetime64[ns]', freq='4h')"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.date_range('2000-01-01','2000-01-02',freq='4h')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 02:30:00',\n",
       "               '2000-01-01 05:00:00', '2000-01-01 07:30:00',\n",
       "               '2000-01-01 10:00:00', '2000-01-01 12:30:00',\n",
       "               '2000-01-01 15:00:00', '2000-01-01 17:30:00',\n",
       "               '2000-01-01 20:00:00', '2000-01-01 22:30:00'],\n",
       "              dtype='datetime64[ns]', freq='150min')"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.date_range('2000-01-01','2000-01-02',freq='2h30min')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "**移动数据**\n",
    "移动（shifting）指的是沿着时间轴将数据前移或后移。Series和DataFrame都有一个shift方法用于执行单纯的前移或后移操作，保持索引不变。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\98614\\AppData\\Local\\Temp\\ipykernel_1496\\604801220.py:2: FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.\n",
      "  index=pd.date_range('1/1/2000', periods=4, freq='M'))\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "2000-01-31   -0.122270\n",
       "2000-02-29   -1.263070\n",
       "2000-03-31   -1.951354\n",
       "2000-04-30    0.058045\n",
       "Freq: ME, dtype: float64"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts = pd.Series(np.random.randn(4),\n",
    "                index=pd.date_range('1/1/2000', periods=4, freq='M'))\n",
    "ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-31        NaN\n",
       "2000-02-29        NaN\n",
       "2000-03-31   -0.12227\n",
       "2000-04-30   -1.26307\n",
       "Freq: ME, dtype: float64"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.shift(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-31   -1.951354\n",
       "2000-02-29    0.058045\n",
       "2000-03-31         NaN\n",
       "2000-04-30         NaN\n",
       "Freq: ME, dtype: float64"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.shift(-2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\98614\\AppData\\Local\\Temp\\ipykernel_1496\\66903024.py:1: FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.\n",
      "  ts.shift(2,freq='M')\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "2000-03-31   -0.122270\n",
       "2000-04-30   -1.263070\n",
       "2000-05-31   -1.951354\n",
       "2000-06-30    0.058045\n",
       "Freq: ME, dtype: float64"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.shift(2,freq='M')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "# 重采样\n",
    "重采样（resampling）指的是将时间序列从一个频率转换到另一个频率的处理过程。将高频率数据聚合到低频率称为降采样（downsampling），而将低频率数据转换到高频率则称为升采样（upsampling）。\n",
    "并不是所有的重采样都能被划分到这两个大类中,例如，将W-WED（每周三）转换为W-FRI既不是降采样也不是升采样。\n",
    "\n",
    "<img alt=\"Image Description\" height=\"300\" src=\"https://cdn.nlark.com/yuque/0/2021/png/1862552/1638194854355-672fba8a-720f-40d3-b35c-70991507d069.png\" width=\"1000\" height=\"800\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2017-01-01</th>\n",
       "      <td>17.752502</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-01-02</th>\n",
       "      <td>33.890379</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-01-03</th>\n",
       "      <td>10.166185</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-01-04</th>\n",
       "      <td>47.227356</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-01-05</th>\n",
       "      <td>46.544299</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-04-06</th>\n",
       "      <td>10.332748</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-04-07</th>\n",
       "      <td>16.334292</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-04-08</th>\n",
       "      <td>49.021301</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-04-09</th>\n",
       "      <td>19.796007</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-04-10</th>\n",
       "      <td>33.970059</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>100 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                    0\n",
       "2017-01-01  17.752502\n",
       "2017-01-02  33.890379\n",
       "2017-01-03  10.166185\n",
       "2017-01-04  47.227356\n",
       "2017-01-05  46.544299\n",
       "...               ...\n",
       "2017-04-06  10.332748\n",
       "2017-04-07  16.334292\n",
       "2017-04-08  49.021301\n",
       "2017-04-09  19.796007\n",
       "2017-04-10  33.970059\n",
       "\n",
       "[100 rows x 1 columns]"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t = pd.DataFrame(np.random.uniform(10,50,(100,1)),index=pd.date_range('20170101',periods=100))\n",
    "t"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\98614\\AppData\\Local\\Temp\\ipykernel_1496\\786118695.py:3: FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.\n",
      "  t.resample('M').mean()\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2017-01-31</th>\n",
       "      <td>27.212442</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-02-28</th>\n",
       "      <td>29.227679</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-03-31</th>\n",
       "      <td>30.149970</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-04-30</th>\n",
       "      <td>28.506865</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                    0\n",
       "2017-01-31  27.212442\n",
       "2017-02-28  29.227679\n",
       "2017-03-31  30.149970\n",
       "2017-04-30  28.506865"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#resample\n",
    "#降采样\n",
    "t.resample('M').mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2017-01-01</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-01-11</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-01-21</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-01-31</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-02-10</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-02-20</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-03-02</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-03-12</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-03-22</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017-04-01</th>\n",
       "      <td>10</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             0\n",
       "2017-01-01  10\n",
       "2017-01-11  10\n",
       "2017-01-21  10\n",
       "2017-01-31  10\n",
       "2017-02-10  10\n",
       "2017-02-20  10\n",
       "2017-03-02  10\n",
       "2017-03-12  10\n",
       "2017-03-22  10\n",
       "2017-04-01  10"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t.resample('10D').count()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>上海</th>\n",
       "      <th>北京</th>\n",
       "      <th>深圳</th>\n",
       "      <th>广州</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2000-01-05</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-12</th>\n",
       "      <td>1.662727</td>\n",
       "      <td>-0.021443</td>\n",
       "      <td>0.080045</td>\n",
       "      <td>1.476445</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  上海        北京        深圳        广州\n",
       "2000-01-05  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-12  1.662727 -0.021443  0.080045  1.476445"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#升采样\n",
    "frame = pd.DataFrame(np.random.randn(2, 4),\n",
    "                    index=pd.date_range('1/1/2000', periods=2,freq='W-WED'),\n",
    "                    columns=['上海', '北京', '深圳', '广州'])\n",
    "frame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>上海</th>\n",
       "      <th>北京</th>\n",
       "      <th>深圳</th>\n",
       "      <th>广州</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2000-01-05</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-06</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-07</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-08</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-09</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-10</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-11</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-12</th>\n",
       "      <td>1.662727</td>\n",
       "      <td>-0.021443</td>\n",
       "      <td>0.080045</td>\n",
       "      <td>1.476445</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  上海        北京        深圳        广州\n",
       "2000-01-05  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-06       NaN       NaN       NaN       NaN\n",
       "2000-01-07       NaN       NaN       NaN       NaN\n",
       "2000-01-08       NaN       NaN       NaN       NaN\n",
       "2000-01-09       NaN       NaN       NaN       NaN\n",
       "2000-01-10       NaN       NaN       NaN       NaN\n",
       "2000-01-11       NaN       NaN       NaN       NaN\n",
       "2000-01-12  1.662727 -0.021443  0.080045  1.476445"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame.resample('D').asfreq()  #转换成高频"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>上海</th>\n",
       "      <th>北京</th>\n",
       "      <th>深圳</th>\n",
       "      <th>广州</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2000-01-05</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-06</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-07</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-08</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-09</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-10</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-11</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-12</th>\n",
       "      <td>1.662727</td>\n",
       "      <td>-0.021443</td>\n",
       "      <td>0.080045</td>\n",
       "      <td>1.476445</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  上海        北京        深圳        广州\n",
       "2000-01-05  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-06  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-07  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-08  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-09  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-10  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-11  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-12  1.662727 -0.021443  0.080045  1.476445"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame.resample('D').ffill()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>上海</th>\n",
       "      <th>北京</th>\n",
       "      <th>深圳</th>\n",
       "      <th>广州</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2000-01-05</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-06</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-07</th>\n",
       "      <td>0.442747</td>\n",
       "      <td>-1.361137</td>\n",
       "      <td>-0.612537</td>\n",
       "      <td>0.633960</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-08</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-09</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-10</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-11</th>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2000-01-12</th>\n",
       "      <td>1.662727</td>\n",
       "      <td>-0.021443</td>\n",
       "      <td>0.080045</td>\n",
       "      <td>1.476445</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  上海        北京        深圳        广州\n",
       "2000-01-05  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-06  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-07  0.442747 -1.361137 -0.612537  0.633960\n",
       "2000-01-08       NaN       NaN       NaN       NaN\n",
       "2000-01-09       NaN       NaN       NaN       NaN\n",
       "2000-01-10       NaN       NaN       NaN       NaN\n",
       "2000-01-11       NaN       NaN       NaN       NaN\n",
       "2000-01-12  1.662727 -0.021443  0.080045  1.476445"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frame.resample('D').ffill(limit=2)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.20"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
